Overview
Dataset statistics
| Number of variables | 12 |
|---|---|
| Number of observations | 11123 |
| Missing cells | 0 |
| Missing cells (%) | 0.0% |
| Duplicate rows | 0 |
| Duplicate rows (%) | 0.0% |
| Total size in memory | 1.0 MiB |
| Average record size in memory | 96.0 B |
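The two memory figures above can be cross-checked with simple arithmetic. The sketch below assumes that every column costs 8 bytes per row (a 64-bit number or an object pointer, so string contents are not counted); the report does not state this, it is only an assumption that happens to reproduce the numbers.

```python
# Assumption (not stated in the report): each of the 12 columns stores
# 8 bytes per row (int64, float64, or an object pointer).
N_ROWS = 11123
N_COLS = 12
BYTES_PER_VALUE = 8

record_bytes = N_COLS * BYTES_PER_VALUE      # bytes per row
total_mib = N_ROWS * record_bytes / 2**20    # whole table, in MiB

print(f"record size: {record_bytes} B")
print(f"total size:  {total_mib:.1f} MiB")
```

Under that assumption the record size matches the 96.0 B reported, and the total rounds to the 1.0 MiB reported.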
Variable types
| Numeric | 6 |
|---|---|
| Text | 5 |
| Categorical | 1 |
Alerts
| Alert | Type |
|---|---|
| ratings_count is highly correlated overall with text_reviews_count | High correlation |
| text_reviews_count is highly correlated overall with ratings_count | High correlation |
| language_code is highly imbalanced (76.6%) | Imbalance |
| isbn13 is highly skewed (γ1 = -21.06647588) | Skewed |
| bookID has unique values | Unique |
| isbn has unique values | Unique |
| isbn13 has unique values | Unique |
| text_reviews_count has 624 (5.6%) zeros | Zeros |
Reproduction
| Analysis started | 2025-07-12 23:23:32.666516 |
|---|---|
| Analysis finished | 2025-07-12 23:23:40.188482 |
| Duration | 7.52 seconds |
| Software version | ydata-profiling v4.16.1 |
| Download configuration | config.json |
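A report like this one can be regenerated with the library named above. The following is a minimal sketch, not the configuration actually used for this run: the function name `build_profile`, the CSV path argument, the output file name, and the report title are all hypothetical, and default settings are assumed.

```python
def build_profile(csv_path, html_path="report.html"):
    """Profile a CSV file and write the HTML report to html_path."""
    # Imported lazily so the function can be defined even when the
    # third-party packages (pandas, ydata-profiling) are not installed.
    import pandas as pd
    from ydata_profiling import ProfileReport

    df = pd.read_csv(csv_path)
    # The title is a hypothetical label, not taken from the original run.
    report = ProfileReport(df, title="Books dataset profile")
    report.to_file(html_path)
    return html_path
```

Calling `build_profile("books.csv")` (a hypothetical file name) would write `report.html` next to the script. Results may differ from the figures here if the library version or configuration differs.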
Variables
bookID
Real number (ℝ)
Unique 
| Distinct | 11123 |
|---|---|
| Distinct (%) | 100.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 21310.857 |
| Minimum | 1 |
|---|---|
| Maximum | 45641 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 87.0 KiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 1800.1 |
| Q1 | 10277.5 |
| median | 20287 |
| Q3 | 32104.5 |
| 95-th percentile | 43067.5 |
| Maximum | 45641 |
| Range | 45640 |
| Interquartile range (IQR) | 21827 |
Descriptive statistics
| Standard deviation | 13094.727 |
|---|---|
| Coefficient of variation (CV) | 0.61446273 |
| Kurtosis | -1.1465879 |
| Mean | 21310.857 |
| Median Absolute Deviation (MAD) | 10884 |
| Skewness | 0.14401023 |
| Sum | 2.3704066 × 10⁸ |
| Variance | 1.7147188 × 10⁸ |
| Monotonicity | Strictly increasing |
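Several of the bookID statistics above are derived from the others, which makes them easy to sanity-check. The sketch below recomputes the range, IQR, coefficient of variation, and variance from the reported minimum, maximum, quartiles, mean, and standard deviation; small differences in the last digits are expected because the inputs are already rounded.

```python
# Values copied from the bookID tables above.
minimum, maximum = 1, 45641
q1, q3 = 10277.5, 32104.5
mean, std = 21310.857, 13094.727

value_range = maximum - minimum   # Range
iqr = q3 - q1                     # Interquartile range (IQR)
cv = std / mean                   # Coefficient of variation (CV)
variance = std ** 2               # Variance

print(value_range, iqr, round(cv, 4), f"{variance:.4e}")
```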
| Value | Count | Frequency (%) |
| 45641 | 1 | < 0.1% |
| 1 | 1 | < 0.1% |
| 2 | 1 | < 0.1% |
| 45585 | 1 | < 0.1% |
| 45583 | 1 | < 0.1% |
| 45574 | 1 | < 0.1% |
| 45572 | 1 | < 0.1% |
| 45570 | 1 | < 0.1% |
| 45568 | 1 | < 0.1% |
| 45564 | 1 | < 0.1% |
| Other values (11113) | 11113 |
| Value | Count | Frequency (%) |
| 1 | 1 | |
| 2 | 1 | |
| 4 | 1 | |
| 5 | 1 | |
| 8 | 1 | |
| 9 | 1 | |
| 10 | 1 | |
| 12 | 1 | |
| 13 | 1 | |
| 14 | 1 |
| Value | Count | Frequency (%) |
| 45641 | 1 | |
| 45639 | 1 | |
| 45634 | 1 | |
| 45633 | 1 | |
| 45631 | 1 | |
| 45630 | 1 | |
| 45626 | 1 | |
| 45625 | 1 | |
| 45623 | 1 | |
| 45617 | 1 |
title
Text
| Distinct | 10348 |
|---|---|
| Distinct (%) | 93.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 87.0 KiB |
Length
| Max length | 254 |
|---|---|
| Median length | 139 |
| Mean length | 35.844287 |
| Min length | 2 |
Unique
| Unique | 9861 |
|---|---|
| Unique (%) | 88.7% |
Sample
| 1st row | Harry Potter and the Half-Blood Prince (Harry Potter #6) |
|---|---|
| 2nd row | Harry Potter and the Order of the Phoenix (Harry Potter #5) |
| 3rd row | Harry Potter and the Chamber of Secrets (Harry Potter #2) |
| 4th row | Harry Potter and the Prisoner of Azkaban (Harry Potter #3) |
| 5th row | Harry Potter Boxed Set Books 1-5 (Harry Potter #1-5) |
| Value | Count | Frequency (%) |
| the | 6688 | 10.1% |
| of | 3334 | 5.0% |
| and | 1651 | 2.5% |
| a | 1335 | 2.0% |
| 1 | 796 | 1.2% |
| in | 777 | 1.2% |
| to | 697 | 1.1% |
|  | 588 | 0.9% |
| 2 | 519 | 0.8% |
| 3 | 399 | 0.6% |
| Other values (12079) | 49507 |
Most occurring characters
| Value | Count | Frequency (%) |
| (space) | 58852 | |
| e | 36612 | 9.2% |
| o | 23573 | 5.9% |
| a | 22292 | 5.6% |
| i | 20601 | 5.2% |
| r | 20200 | 5.1% |
| n | 20016 | 5.0% |
| t | 19138 | 4.8% |
| s | 16684 | 4.2% |
| h | 13689 | 3.4% |
| Other values (157) | 147039 |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 398696 |
authors
Text
| Distinct | 6639 |
|---|---|
| Distinct (%) | 59.7% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 87.0 KiB |
Length
| Max length | 751 |
|---|---|
| Median length | 375 |
| Mean length | 24.823968 |
| Min length | 3 |
Unique
| Unique | 5278 |
|---|---|
| Unique (%) | 47.5% |
Sample
| 1st row | J.K. Rowling/Mary GrandPré |
|---|---|
| 2nd row | J.K. Rowling/Mary GrandPré |
| 3rd row | J.K. Rowling |
| 4th row | J.K. Rowling/Mary GrandPré |
| 5th row | J.K. Rowling/Mary GrandPré |
| Value | Count | Frequency (%) |
| john | 279 | 0.8% |
| william | 262 | 0.8% |
| james | 227 | 0.7% |
| david | 201 | 0.6% |
| a | 191 | 0.6% |
| robert | 185 | 0.5% |
| j | 181 | 0.5% |
| stephen | 176 | 0.5% |
| richard | 157 | 0.5% |
| m | 155 | 0.5% |
| Other values (12654) | 31794 |
Most occurring characters
| Value | Count | Frequency (%) |
| e | 23738 | 8.6% |
| (space) | 23397 | 8.5% |
| a | 22538 | 8.2% |
| r | 18107 | 6.6% |
| n | 17411 | 6.3% |
| i | 15667 | 5.7% |
| o | 14404 | 5.2% |
| l | 13268 | 4.8% |
| s | 9813 | 3.6% |
| t | 9521 | 3.4% |
| Other values (138) | 108253 |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 276117 |
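The authors sample rows above pack several contributors into one cell, separated by `/`. A minimal sketch for splitting such a cell follows; it assumes that `/` only ever acts as a separator and never occurs inside a name, which the report does not confirm. The helper name `split_authors` is hypothetical.

```python
def split_authors(cell):
    """Split a '/'-separated authors cell into a list of trimmed names."""
    return [name.strip() for name in cell.split("/") if name.strip()]


# Values copied from the authors Sample table above.
samples = ["J.K. Rowling/Mary GrandPré", "J.K. Rowling"]
for cell in samples:
    print(split_authors(cell))
```

Counting distinct contributors after such a split would give a different figure from the 6639 distinct cells reported here, since the same person can appear in many combinations.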
average_rating
Real number (ℝ)
| Distinct | 209 |
|---|---|
| Distinct (%) | 1.9% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 3.9340753 |
| Minimum | 0 |
|---|---|
| Maximum | 5 |
| Zeros | 25 |
| Zeros (%) | 0.2% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 87.0 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 3.44 |
| Q1 | 3.77 |
| median | 3.96 |
| Q3 | 4.14 |
| 95-th percentile | 4.38 |
| Maximum | 5 |
| Range | 5 |
| Interquartile range (IQR) | 0.37 |
Descriptive statistics
| Standard deviation | 0.35048531 |
|---|---|
| Coefficient of variation (CV) | 0.089089629 |
| Kurtosis | 36.222806 |
| Mean | 3.9340753 |
| Median Absolute Deviation (MAD) | 0.18 |
| Skewness | -3.5774415 |
| Sum | 43758.72 |
| Variance | 0.12283995 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 4 | 219 | 2.0% |
| 3.96 | 195 | 1.8% |
| 4.02 | 178 | 1.6% |
| 3.94 | 176 | 1.6% |
| 4.07 | 172 | 1.5% |
| 3.93 | 168 | 1.5% |
| 4.05 | 168 | 1.5% |
| 3.92 | 168 | 1.5% |
| 3.83 | 166 | 1.5% |
| 3.89 | 166 | 1.5% |
| Other values (199) | 9347 |
| Value | Count | Frequency (%) |
| 0 | 25 | |
| 1 | 2 | < 0.1% |
| 1.67 | 1 | < 0.1% |
| 2 | 6 | 0.1% |
| 2.33 | 1 | < 0.1% |
| 2.4 | 1 | < 0.1% |
| 2.5 | 1 | < 0.1% |
| 2.55 | 1 | < 0.1% |
| 2.61 | 1 | < 0.1% |
| 2.62 | 3 | < 0.1% |
| Value | Count | Frequency (%) |
| 5 | 22 | |
| 4.91 | 1 | < 0.1% |
| 4.88 | 1 | < 0.1% |
| 4.86 | 1 | < 0.1% |
| 4.83 | 1 | < 0.1% |
| 4.82 | 1 | < 0.1% |
| 4.8 | 1 | < 0.1% |
| 4.78 | 2 | < 0.1% |
| 4.76 | 1 | < 0.1% |
| 4.75 | 2 | < 0.1% |
isbn
Text
Unique 
| Distinct | 11123 |
|---|---|
| Distinct (%) | 100.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 87.0 KiB |
Length
| Max length | 10 |
|---|---|
| Median length | 10 |
| Mean length | 9.9999101 |
| Min length | 9 |
Unique
| Unique | 11123 |
|---|---|
| Unique (%) | 100.0% |
Sample
| 1st row | 0439785960 |
|---|---|
| 2nd row | 0439358078 |
| 3rd row | 0439554896 |
| 4th row | 043965548X |
| 5th row | 0439682584 |
| Value | Count | Frequency (%) |
| 043965548x | 1 | < 0.1% |
| 8497646983 | 1 | < 0.1% |
| 0439785960 | 1 | < 0.1% |
| 8466302298 | 1 | < 0.1% |
| 8432203238 | 1 | < 0.1% |
| 9583006408 | 1 | < 0.1% |
| 0061199001 | 1 | < 0.1% |
| 972233168x | 1 | < 0.1% |
| 9722332201 | 1 | < 0.1% |
| 9722330551 | 1 | < 0.1% |
| Other values (11113) | 11113 |
Most occurring characters
| Value | Count | Frequency (%) |
| 0 | 19990 | |
| 1 | 12568 | |
| 4 | 11493 | |
| 5 | 10540 | |
| 3 | 10379 | |
| 2 | 9463 | |
| 7 | 9344 | |
| 8 | 9101 | |
| 6 | 9076 | |
| 9 | 8291 | |
| Other values (2) | 984 | 0.9% |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 111229 |
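The isbn Length table reports a minimum length of 9, and the character total of 111229 is one short of 11123 × 10, so exactly one value is shorter than the 10 characters an ISBN-10 requires. The sketch below shows one way to validate the standard ISBN-10 check digit and to derive the matching ISBN-13; the function names are hypothetical and the inputs are taken from the isbn Sample table.

```python
def is_valid_isbn10(isbn):
    """Return True if isbn is a 10-character ISBN-10 with a valid check digit."""
    isbn = isbn.upper()
    if len(isbn) != 10 or not isbn[:9].isdigit():
        return False
    if not (isbn[9].isdigit() or isbn[9] == "X"):
        return False
    digits = [int(c) for c in isbn[:9]] + [10 if isbn[9] == "X" else int(isbn[9])]
    # Weights run from 10 down to 1; a valid number sums to a multiple of 11.
    return sum(w * d for w, d in zip(range(10, 0, -1), digits)) % 11 == 0


def isbn10_to_isbn13(isbn):
    """Prefix 978 to the first nine digits and recompute the check digit."""
    core = "978" + isbn[:9]
    # ISBN-13 weights alternate 1, 3, 1, 3, ... over the first 12 digits.
    total = sum(int(c) * (1 if i % 2 == 0 else 3) for i, c in enumerate(core))
    return core + str((10 - total % 10) % 10)


# ISBNs copied from the isbn Sample table above.
for isbn in ["0439785960", "043965548X"]:
    print(isbn, is_valid_isbn10(isbn), isbn10_to_isbn13(isbn))
```

For these two inputs the converted values agree with the isbn13 column shown for the same books in the Sample section of this report.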
isbn13
Real number (ℝ)
Skewed  Unique 
| Distinct | 11123 |
|---|---|
| Distinct (%) | 100.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 9.7598802 × 10¹² |
| Minimum | 8.9870598 × 10⁹ |
|---|---|
| Maximum | 9.7900077 × 10¹² |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 87.0 KiB |
Quantile statistics
| Minimum | 8.9870598 × 10⁹ |
|---|---|
| 5-th percentile | 9.7800609 × 10¹² |
| Q1 | 9.7803455 × 10¹² |
| median | 9.7805825 × 10¹² |
| Q3 | 9.7808722 × 10¹² |
| 95-th percentile | 9.7819322 × 10¹² |
| Maximum | 9.7900077 × 10¹² |
| Range | 9.7810206 × 10¹² |
| Interquartile range (IQR) | 5.2675424 × 10⁸ |
Descriptive statistics
| Standard deviation | 4.4297585 × 10¹¹ |
|---|---|
| Coefficient of variation (CV) | 0.045387426 |
| Kurtosis | 442.47375 |
| Mean | 9.7598802 × 10¹² |
| Median Absolute Deviation (MAD) | 2.5197677 × 10⁸ |
| Skewness | -21.066476 |
| Sum | 1.0855915 × 10¹⁷ |
| Variance | 1.962276 × 10²³ |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 9.788497647 × 10¹² | 1 | < 0.1% |
| 9.780439786 × 10¹² | 1 | < 0.1% |
| 9.780439358 × 10¹² | 1 | < 0.1% |
| 9.788432217 × 10¹² | 1 | < 0.1% |
| 9.788466319 × 10¹² | 1 | < 0.1% |
| 9.781932395 × 10¹² | 1 | < 0.1% |
| 9.781855495 × 10¹² | 1 | < 0.1% |
| 9.780573051 × 10¹² | 1 | < 0.1% |
| 9.789681907 × 10¹² | 1 | < 0.1% |
| 9.781568521 × 10¹² | 1 | < 0.1% |
| Other values (11113) | 11113 |
| Value | Count | Frequency (%) |
| 8987059752 | 1 | |
| 2.004913 × 10¹⁰ | 1 | |
| 2.375500432 × 10¹⁰ | 1 | |
| 3.44060546 × 10¹⁰ | 1 | |
| 4.908600776 × 10¹⁰ | 1 | |
| 7.399914077 × 10¹⁰ | 1 | |
| 7.399925491 × 10¹⁰ | 1 | |
| 7.399976844 × 10¹⁰ | 1 | |
| 7.399996082 × 10¹⁰ | 1 | |
| 7.609202599 × 10¹⁰ | 1 |
| Value | Count | Frequency (%) |
| 9.790007672 × 10¹² | 1 | |
| 9.789998692 × 10¹² | 1 | |
| 9.789879398 × 10¹² | 1 | |
| 9.789875801 × 10¹² | 1 | |
| 9.789875662 × 10¹² | 1 | |
| 9.78987225 × 10¹² | 1 | |
| 9.789861157 × 10¹² | 1 | |
| 9.789861157 × 10¹² | 1 | |
| 9.789861146 × 10¹² | 1 | |
| 9.789861146 × 10¹² | 1 |
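Because isbn13 was typed as a real number, its statistics treat identifiers as magnitudes. The strong negative skew is likely driven by the few values with fewer than 13 digits listed among the smallest values above, set against a bulk near 9.78 × 10¹². The sketch below flags such short identifiers; the helper name `short_identifiers` is hypothetical, and the three inputs are copied from the smallest-values table and the Sample section of this report.

```python
# isbn13 values copied from this report, as the integers the profiler
# parsed them into.
values = [8987059752, 9780439785969, 9788497646987]


def short_identifiers(numbers, width=13):
    """Return the values whose decimal form has fewer than `width` digits."""
    return [n for n in numbers if len(str(n)) < width]


print(short_identifiers(values))
```

Reading the column as text instead of a number would keep any leading zeros and would make moments such as mean and skewness inapplicable, which may be the more appropriate treatment for an identifier.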
language_code
Categorical
Imbalance 
| Distinct | 27 |
|---|---|
| Distinct (%) | 0.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 87.0 KiB |
| Value | Count |
|---|---|
| eng | 8908 |
| en-US | 1408 |
| spa | 218 |
| en-GB | 214 |
| fre | 144 |
| Other values (22) | 231 |
Length
| Max length | 5 |
|---|---|
| Median length | 3 |
| Mean length | 3.2928167 |
| Min length | 2 |
Unique
| Unique | 10 |
|---|---|
| Unique (%) | 0.1% |
Sample
| 1st row | eng |
|---|---|
| 2nd row | eng |
| 3rd row | eng |
| 4th row | eng |
| 5th row | eng |
Common Values
| Value | Count | Frequency (%) |
| eng | 8908 | |
| en-US | 1408 | 12.7% |
| spa | 218 | 2.0% |
| en-GB | 214 | 1.9% |
| fre | 144 | 1.3% |
| ger | 99 | 0.9% |
| jpn | 46 | 0.4% |
| mul | 19 | 0.2% |
| zho | 14 | 0.1% |
| grc | 11 | 0.1% |
| Other values (17) | 42 | 0.4% |
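The imbalance alert for language_code is easier to interpret once the English variants are grouped. The sketch below sums the counts from the Common Values table above; treating `eng`, `en-US`, and `en-GB` as one language is an assumption, and the result is a lower bound because the 17 grouped codes may contain further English variants.

```python
# Counts copied from the Common Values table above.
counts = {
    "eng": 8908, "en-US": 1408, "spa": 218, "en-GB": 214, "fre": 144,
    "ger": 99, "jpn": 46, "mul": 19, "zho": 14, "grc": 11,
    "Other values (17)": 42,
}

# Assumption: these three codes all denote English editions.
ENGLISH_CODES = {"eng", "en-US", "en-GB"}

total = sum(counts.values())
english = sum(n for code, n in counts.items() if code in ENGLISH_CODES)

print(f"total rows: {total}")
print(f"English share: {english / total:.1%}")
```

The total reproduces the 11123 observations in the overview, which confirms that the table accounts for every row.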
Words
| Value | Count | Frequency (%) |
| eng | 8908 | |
| en-us | 1408 | 12.7% |
| spa | 218 | 2.0% |
| en-gb | 214 | 1.9% |
| fre | 144 | 1.3% |
| ger | 99 | 0.9% |
| jpn | 46 | 0.4% |
| mul | 19 | 0.2% |
| zho | 14 | 0.1% |
| grc | 11 | 0.1% |
| Other values (17) | 42 | 0.4% |
Most occurring characters
| Value | Count | Frequency (%) |
| e | 10787 | |
| n | 10588 | |
| g | 9021 | |
| - | 1629 | 4.4% |
| U | 1408 | 3.8% |
| S | 1408 | 3.8% |
| p | 275 | 0.8% |
| r | 270 | 0.7% |
| a | 231 | 0.6% |
| s | 224 | 0.6% |
| Other values (16) | 785 | 2.1% |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 36626 |
num_pages
Real number (ℝ)
| Distinct | 997 |
|---|---|
| Distinct (%) | 9.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 336.40556 |
| Minimum | 0 |
|---|---|
| Maximum | 6576 |
| Zeros | 76 |
| Zeros (%) | 0.7% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 87.0 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 48 |
| Q1 | 192 |
| median | 299 |
| Q3 | 416 |
| 95-th percentile | 752 |
| Maximum | 6576 |
| Range | 6576 |
| Interquartile range (IQR) | 224 |
Descriptive statistics
| Standard deviation | 241.15263 |
|---|---|
| Coefficient of variation (CV) | 0.7168509 |
| Kurtosis | 62.415973 |
| Mean | 336.40556 |
| Median Absolute Deviation (MAD) | 107 |
| Skewness | 4.2717781 |
| Sum | 3741839 |
| Variance | 58154.589 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 288 | 230 | 2.1% |
| 192 | 221 | 2.0% |
| 320 | 218 | 2.0% |
| 256 | 207 | 1.9% |
| 352 | 202 | 1.8% |
| 224 | 198 | 1.8% |
| 208 | 178 | 1.6% |
| 304 | 177 | 1.6% |
| 240 | 173 | 1.6% |
| 384 | 172 | 1.5% |
| Other values (987) | 9147 |
| Value | Count | Frequency (%) |
| 0 | 76 | |
| 1 | 11 | 0.1% |
| 2 | 15 | 0.1% |
| 3 | 19 | 0.2% |
| 4 | 11 | 0.1% |
| 5 | 16 | 0.1% |
| 6 | 20 | 0.2% |
| 7 | 6 | 0.1% |
| 8 | 10 | 0.1% |
| 9 | 11 | 0.1% |
| Value | Count | Frequency (%) |
| 6576 | 1 | |
| 4736 | 1 | |
| 3400 | 1 | |
| 3342 | 1 | |
| 3020 | 1 | |
| 2751 | 1 | |
| 2690 | 1 | |
| 2480 | 1 | |
| 2264 | 1 | |
| 2198 | 1 |
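The num_pages distribution has a long right tail (skewness 4.27, maximum 6576 against a median of 299). One common way to put a threshold on that tail is Tukey's 1.5 × IQR rule. The report itself applies no outlier rule, so the sketch below is only an illustration built on the reported quartiles.

```python
# Quartiles copied from the num_pages quantile table above.
q1, q3 = 192, 416
iqr = q3 - q1

# Tukey's rule (an assumption here; the report applies no outlier rule):
# values more than 1.5 * IQR beyond the quartiles are flagged.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

# Largest values copied from the table above.
largest = [6576, 4736, 3400, 3342, 3020]
flagged = [v for v in largest if v > upper_fence]

print(lower_fence, upper_fence, flagged)
```

The upper fence works out to 752, which happens to equal the reported 95-th percentile, so roughly 5% of the books would sit at or beyond it. The lower fence is negative and therefore flags nothing, including the 76 zero-page rows.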
ratings_count
Real number (ℝ)
High correlation 
| Distinct | 5294 |
|---|---|
| Distinct (%) | 47.6% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 17942.848 |
| Minimum | 0 |
|---|---|
| Maximum | 4597666 |
| Zeros | 80 |
| Zeros (%) | 0.7% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 87.0 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 8 |
| Q1 | 104 |
| median | 745 |
| Q3 | 5000.5 |
| 95-th percentile | 61114 |
| Maximum | 4597666 |
| Range | 4597666 |
| Interquartile range (IQR) | 4896.5 |
Descriptive statistics
| Standard deviation | 112499.15 |
|---|---|
| Coefficient of variation (CV) | 6.2698605 |
| Kurtosis | 442.27167 |
| Mean | 17942.848 |
| Median Absolute Deviation (MAD) | 728 |
| Skewness | 17.693952 |
| Sum | 1.995783 × 10⁸ |
| Variance | 1.265606 × 10¹⁰ |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 3 | 82 | 0.7% |
| 0 | 80 | 0.7% |
| 1 | 76 | 0.7% |
| 2 | 71 | 0.6% |
| 4 | 71 | 0.6% |
| 5 | 61 | 0.5% |
| 9 | 60 | 0.5% |
| 8 | 59 | 0.5% |
| 6 | 57 | 0.5% |
| 7 | 56 | 0.5% |
| Other values (5284) | 10450 |
| Value | Count | Frequency (%) |
| 0 | 80 | |
| 1 | 76 | |
| 2 | 71 | |
| 3 | 82 | |
| 4 | 71 | |
| 5 | 61 | |
| 6 | 57 | |
| 7 | 56 | |
| 8 | 59 | |
| 9 | 60 |
| Value | Count | Frequency (%) |
| 4597666 | 1 | |
| 2530894 | 1 | |
| 2457092 | 1 | |
| 2418736 | 1 | |
| 2339585 | 1 | |
| 2293963 | 1 | |
| 2153167 | 1 | |
| 2128944 | 1 | |
| 2111750 | 1 | |
| 2095690 | 1 |
text_reviews_count
Real number (ℝ)
High correlation  Zeros 
| Distinct | 1822 |
|---|---|
| Distinct (%) | 16.4% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 542.0481 |
| Minimum | 0 |
|---|---|
| Maximum | 94265 |
| Zeros | 624 |
| Zeros (%) | 5.6% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 87.0 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 9 |
| median | 47 |
| Q3 | 238 |
| 95-th percentile | 2158.9 |
| Maximum | 94265 |
| Range | 94265 |
| Interquartile range (IQR) | 229 |
Descriptive statistics
| Standard deviation | 2576.6196 |
|---|---|
| Coefficient of variation (CV) | 4.7534888 |
| Kurtosis | 396.56506 |
| Mean | 542.0481 |
| Median Absolute Deviation (MAD) | 45 |
| Skewness | 16.175096 |
| Sum | 6029201 |
| Variance | 6638968.5 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 0 | 624 | 5.6% |
| 1 | 458 | 4.1% |
| 2 | 354 | 3.2% |
| 3 | 263 | 2.4% |
| 4 | 247 | 2.2% |
| 5 | 223 | 2.0% |
| 6 | 199 | 1.8% |
| 7 | 180 | 1.6% |
| 9 | 164 | 1.5% |
| 8 | 162 | 1.5% |
| Other values (1812) | 8249 |
| Value | Count | Frequency (%) |
| 0 | 624 | |
| 1 | 458 | |
| 2 | 354 | |
| 3 | 263 | |
| 4 | 247 | 2.2% |
| 5 | 223 | 2.0% |
| 6 | 199 | 1.8% |
| 7 | 180 | 1.6% |
| 8 | 162 | 1.5% |
| 9 | 164 | 1.5% |
| Value | Count | Frequency (%) |
| 94265 | 1 | |
| 86881 | 1 | |
| 56604 | 1 | |
| 55843 | 1 | |
| 52759 | 1 | |
| 47951 | 1 | |
| 47620 | 1 | |
| 46176 | 1 | |
| 43499 | 1 | |
| 36325 | 1 |
publication_date
Text
| Distinct | 3679 |
|---|---|
| Distinct (%) | 33.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 87.0 KiB |
Length
| Max length | 10 |
|---|---|
| Median length | 9 |
| Mean length | 8.7239953 |
| Min length | 8 |
Unique
| Unique | 2022 |
|---|---|
| Unique (%) | 18.2% |
Sample
| 1st row | 9/16/2006 |
|---|---|
| 2nd row | 9/1/2004 |
| 3rd row | 11/1/2003 |
| 4th row | 5/1/2004 |
| 5th row | 9/13/2004 |
| Value | Count | Frequency (%) |
| 10/1/2005 | 56 | 0.5% |
| 11/1/2005 | 53 | 0.5% |
| 9/1/2006 | 51 | 0.5% |
| 10/1/2006 | 48 | 0.4% |
| 11/1/2006 | 40 | 0.4% |
| 8/1/2006 | 39 | 0.4% |
| 7/1/2004 | 39 | 0.4% |
| 8/1/2005 | 37 | 0.3% |
| 7/1/2003 | 37 | 0.3% |
| 10/1/2004 | 37 | 0.3% |
| Other values (3669) | 10686 |
Most occurring characters
| Value | Count | Frequency (%) |
| / | 22246 | |
| 0 | 17995 | |
| 1 | 15687 | |
| 2 | 13215 | |
| 9 | 8399 | 8.7% |
| 6 | 3779 | 3.9% |
| 5 | 3681 | 3.8% |
| 3 | 3359 | 3.5% |
| 4 | 3080 | 3.2% |
| 7 | 2824 | 2.9% |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 97037 |
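publication_date was profiled as text, with values in an M/D/YYYY layout such as 9/16/2006. A minimal sketch for turning those strings into dates follows. The helper name `parse_publication_date` is hypothetical, and returning `None` for unparseable input is a design choice of this example, not something the report prescribes.

```python
from datetime import datetime


def parse_publication_date(text):
    """Parse an M/D/YYYY string; return None if it is not a valid date."""
    try:
        return datetime.strptime(text, "%m/%d/%Y").date()
    except ValueError:
        return None


# Two values copied from the Sample table above, plus one deliberately
# invalid string (hypothetical) to show the fallback.
for text in ["9/16/2006", "11/1/2003", "2/30/2004"]:
    print(text, "->", parse_publication_date(text))
```

Once parsed, the column could be profiled as a date, which would give a minimum, a maximum, and a histogram over time instead of character counts.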
publisher
Text
| Distinct | 2290 |
|---|---|
| Distinct (%) | 20.6% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 87.0 KiB |
Length
| Max length | 67 |
|---|---|
| Median length | 53 |
| Mean length | 15.226647 |
| Min length | 2 |
Unique
| Unique | 1295 |
|---|---|
| Unique (%) | 11.6% |
Sample
| 1st row | Scholastic Inc. |
|---|---|
| 2nd row | Scholastic Inc. |
| 3rd row | Scholastic |
| 4th row | Scholastic Inc. |
| 5th row | Scholastic |
| Value | Count | Frequency (%) |
| books | 2302 | 9.3% |
| press | 1314 | 5.3% |
| penguin | 598 | 2.4% |
| university | 551 | 2.2% |
| publishing | 511 | 2.1% |
| vintage | 409 | 1.6% |
|  | 352 | 1.4% |
| classics | 344 | 1.4% |
| company | 331 | 1.3% |
| house | 319 | 1.3% |
| Other values (2015) | 17764 |
Most occurring characters
| Value | Count | Frequency (%) |
| (space) | 14210 | 8.4% |
| o | 12961 | 7.7% |
| e | 12943 | 7.6% |
| s | 11905 | 7.0% |
| r | 11776 | 7.0% |
| i | 10684 | 6.3% |
| a | 10474 | 6.2% |
| n | 10454 | 6.2% |
| l | 6207 | 3.7% |
| t | 5814 | 3.4% |
| Other values (129) | 61938 |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 169366 |
Interactions
Correlations
| | num_pages | average_rating | bookID | isbn13 | language_code | ratings_count | text_reviews_count |
|---|---|---|---|---|---|---|---|
| num_pages | 1.000 | 0.110 | -0.010 | -0.138 | 0.000 | 0.185 | 0.168 |
| average_rating | 0.110 | 1.000 | -0.037 | 0.054 | 0.102 | 0.086 | 0.032 |
| bookID | -0.010 | -0.037 | 1.000 | 0.041 | 0.050 | -0.099 | -0.112 |
| isbn13 | -0.138 | 0.054 | 0.041 | 1.000 | 0.000 | -0.252 | -0.264 |
| language_code | 0.000 | 0.102 | 0.050 | 0.000 | 1.000 | 0.000 | 0.000 |
| ratings_count | 0.185 | 0.086 | -0.099 | -0.252 | 0.000 | 1.000 | 0.959 |
| text_reviews_count | 0.168 | 0.032 | -0.112 | -0.264 | 0.000 | 0.959 | 1.000 |
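The matrix above does not say which coefficient was used. As an illustration only, the sketch below computes a Spearman rank correlation between ratings_count and text_reviews_count, which is an assumption about the method, on the first ten rows shown in the Sample section of this report. Ten rows are far too few to reproduce the 0.959 reported for the full dataset.

```python
def ranks(values):
    """Rank values from 1 (smallest) upward; assumes there are no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def spearman(x, y):
    """Spearman rank correlation for equal-length, tie-free sequences."""
    n = len(x)
    d_squared = sum((a - b) ** 2 for a, b in zip(ranks(x), ranks(y)))
    return 1 - 6 * d_squared / (n * (n * n - 1))


# ratings_count and text_reviews_count for the first ten rows, copied
# from the Sample section of this report.
ratings = [2095690, 2153167, 6333, 2339585, 41428, 19, 28242, 3628, 249558, 4930]
reviews = [27591, 29221, 244, 36325, 164, 1, 808, 254, 4080, 460]

print(round(spearman(ratings, reviews), 3))
```

A rank-based coefficient is a reasonable choice for these two columns because both are heavily right-skewed, so a few very popular books would otherwise dominate a linear measure.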
Missing values
Sample
First rows
| | bookID | title | authors | average_rating | isbn | isbn13 | language_code | num_pages | ratings_count | text_reviews_count | publication_date | publisher |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | Harry Potter and the Half-Blood Prince (Harry Potter #6) | J.K. Rowling/Mary GrandPré | 4.57 | 0439785960 | 9780439785969 | eng | 652 | 2095690 | 27591 | 9/16/2006 | Scholastic Inc. |
| 1 | 2 | Harry Potter and the Order of the Phoenix (Harry Potter #5) | J.K. Rowling/Mary GrandPré | 4.49 | 0439358078 | 9780439358071 | eng | 870 | 2153167 | 29221 | 9/1/2004 | Scholastic Inc. |
| 2 | 4 | Harry Potter and the Chamber of Secrets (Harry Potter #2) | J.K. Rowling | 4.42 | 0439554896 | 9780439554893 | eng | 352 | 6333 | 244 | 11/1/2003 | Scholastic |
| 3 | 5 | Harry Potter and the Prisoner of Azkaban (Harry Potter #3) | J.K. Rowling/Mary GrandPré | 4.56 | 043965548X | 9780439655484 | eng | 435 | 2339585 | 36325 | 5/1/2004 | Scholastic Inc. |
| 4 | 8 | Harry Potter Boxed Set Books 1-5 (Harry Potter #1-5) | J.K. Rowling/Mary GrandPré | 4.78 | 0439682584 | 9780439682589 | eng | 2690 | 41428 | 164 | 9/13/2004 | Scholastic |
| 5 | 9 | Unauthorized Harry Potter Book Seven News: "Half-Blood Prince" Analysis and Speculation | W. Frederick Zimmerman | 3.74 | 0976540606 | 9780976540601 | en-US | 152 | 19 | 1 | 4/26/2005 | Nimble Books |
| 6 | 10 | Harry Potter Collection (Harry Potter #1-6) | J.K. Rowling | 4.73 | 0439827604 | 9780439827607 | eng | 3342 | 28242 | 808 | 9/12/2005 | Scholastic |
| 7 | 12 | The Ultimate Hitchhiker's Guide: Five Complete Novels and One Story (Hitchhiker's Guide to the Galaxy #1-5) | Douglas Adams | 4.38 | 0517226952 | 9780517226957 | eng | 815 | 3628 | 254 | 11/1/2005 | Gramercy Books |
| 8 | 13 | The Ultimate Hitchhiker's Guide to the Galaxy (Hitchhiker's Guide to the Galaxy #1-5) | Douglas Adams | 4.38 | 0345453743 | 9780345453747 | eng | 815 | 249558 | 4080 | 4/30/2002 | Del Rey Books |
| 9 | 14 | The Hitchhiker's Guide to the Galaxy (Hitchhiker's Guide to the Galaxy #1) | Douglas Adams | 4.22 | 1400052920 | 9781400052929 | eng | 215 | 4930 | 460 | 8/3/2004 | Crown |
Last rows
| | bookID | title | authors | average_rating | isbn | isbn13 | language_code | num_pages | ratings_count | text_reviews_count | publication_date | publisher |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 11113 | 45617 | O Cavalo e o Seu Rapaz (As Crónicas de Nárnia #3) | C.S. Lewis/Pauline Baynes/Ana Falcão Bastos | 3.92 | 9722330551 | 9789722330558 | por | 160 | 207 | 16 | 8/15/2003 | Editorial Presença |
| 11114 | 45623 | O Sobrinho do Mágico (As Crónicas de Nárnia #1) | C.S. Lewis/Pauline Baynes/Ana Falcão Bastos | 4.04 | 9722329987 | 9789722329989 | por | 147 | 396 | 37 | 4/8/2003 | Editorial Presença |
| 11115 | 45625 | A Viagem do Caminheiro da Alvorada (As Crónicas de Nárnia #5) | C.S. Lewis/Pauline Baynes/Ana Falcão Bastos | 4.09 | 9722331329 | 9789722331326 | por | 176 | 161 | 14 | 9/1/2004 | Editorial Presença |
| 11116 | 45626 | O Príncipe Caspian (As Crónicas de Nárnia #4) | C.S. Lewis/Pauline Baynes/Ana Falcão Bastos | 3.97 | 9722330977 | 9789722330978 | por | 160 | 215 | 11 | 10/11/2003 | Editorial Presença |
| 11117 | 45630 | Whores for Gloria | William T. Vollmann | 3.69 | 0140231579 | 9780140231571 | en-US | 160 | 932 | 111 | 2/1/1994 | Penguin Books |
| 11118 | 45631 | Expelled from Eden: A William T. Vollmann Reader | William T. Vollmann/Larry McCaffery/Michael Hemmingson | 4.06 | 1560254416 | 9781560254416 | eng | 512 | 156 | 20 | 12/21/2004 | Da Capo Press |
| 11119 | 45633 | You Bright and Risen Angels | William T. Vollmann | 4.08 | 0140110879 | 9780140110876 | eng | 635 | 783 | 56 | 12/1/1988 | Penguin Books |
| 11120 | 45634 | The Ice-Shirt (Seven Dreams #1) | William T. Vollmann | 3.96 | 0140131965 | 9780140131963 | eng | 415 | 820 | 95 | 8/1/1993 | Penguin Books |
| 11121 | 45639 | Poor People | William T. Vollmann | 3.72 | 0060878827 | 9780060878825 | eng | 434 | 769 | 139 | 2/27/2007 | Ecco |
| 11122 | 45641 | Las aventuras de Tom Sawyer | Mark Twain | 3.91 | 8497646983 | 9788497646987 | spa | 272 | 113 | 12 | 5/28/2006 | Edimat Libros |